[test-operator] Move away from Jobs to Pods #2663

Closed

Conversation

lpiwowar

With this PR [1] the test operator dropped the usage of OCP Jobs for
spawning of the test pods.

This change updates the test-operator role so that it works with the
new version of the test-operator that spawns test pods directly through
the OCP Pods object.

[1] openstack-k8s-operators/test-operator#266

fultonj and others added 30 commits December 17, 2024 17:20
The Ceph Upgrade tasks in the cifmw_cephadm role
will fail before the upgrade starts if the health
status is warn or error.

This patch changes it so that the upgrade only fails
if the cluster is in health error.

We have had the job fail in CI but we do not know why.
The task should log the Ceph health before starting the
upgrade so that CI results will give the job owner more
insight into why the job failed.

Signed-off-by: John Fulton <[email protected]>
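The health gate described in the Ceph upgrade commit above can be sketched as follows. This is a minimal illustration, not the role's actual Ansible tasks: the function names `upgrade_should_abort` and `precheck` are hypothetical, and the only facts assumed are Ceph's standard `HEALTH_OK`, `HEALTH_WARN` and `HEALTH_ERR` status strings.

```python
def upgrade_should_abort(health_status: str) -> bool:
    """Return True only for HEALTH_ERR; HEALTH_WARN no longer blocks the upgrade."""
    return health_status.strip().upper() == "HEALTH_ERR"


def precheck(health_status: str) -> str:
    """Log the health status first, then abort only on HEALTH_ERR (hypothetical helper)."""
    # Logging before the check means the CI output shows the cluster state
    # even when the upgrade is aborted.
    print(f"Ceph health before upgrade: {health_status}")
    if upgrade_should_abort(health_status):
        raise RuntimeError(f"Refusing to upgrade: cluster is in {health_status}")
    return health_status
```

With this gate, a cluster in `HEALTH_WARN` proceeds to the upgrade while its status is still recorded for the job owner.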
test-operator currently supports 4 test frameworks: tempest, tobiko,
horizontests and ansibletests. The relevant variables required to
download their corresponding container images had to be configured
separately, but usually were configured with common values.
With this PR, default values for registry, namespace and tag can be
configured and they apply to the 4 test frameworks.
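As a rough sketch of how shared defaults with per-framework overrides might behave, consider the snippet below. It is an illustration only: `resolve_image_settings`, the dictionary keys and the example values are assumptions and do not reflect the role's real variable names.

```python
# Framework names as listed in the commit message above.
FRAMEWORKS = ("tempest", "tobiko", "horizontests", "ansibletests")


def resolve_image_settings(defaults, overrides=None):
    """Merge common registry/namespace/tag defaults with per-framework overrides.

    Both arguments are plain dicts; the key names are hypothetical.
    """
    overrides = overrides or {}
    resolved = {}
    for framework in FRAMEWORKS:
        # Per-framework values win over the shared defaults.
        resolved[framework] = {**defaults, **overrides.get(framework, {})}
    return resolved


# Hypothetical values, for illustration only.
settings = resolve_image_settings(
    {"registry": "quay.io", "namespace": "example", "tag": "latest"},
    {"tobiko": {"tag": "0.8.4"}},
)
```

Here only `tobiko` gets a different tag; the other three frameworks inherit every default.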
Grab the JUnitXML results files (if any) returned by kuttl tests
(they may not be configured to produce them, so ignore failures)
and copy them to a location that will be collected.
In some environments it looks like SSHd is taking a bit longer
to be ready and it rejects SSH attempts leading to a failure in the CI
run.
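The commit message above does not spell out the fix. One common way to tolerate a slow SSHd is to retry a readiness probe with a delay, sketched below. The helper `wait_until_ready` and its parameters are hypothetical and are not taken from the repository.

```python
import time


def wait_until_ready(probe, retries=10, delay=5.0, sleep=time.sleep):
    """Call probe() until it returns True; return the attempt number that succeeded.

    `probe` is any zero-argument callable, for example one that tries to
    open an SSH connection. `sleep` is injectable to make testing easy.
    """
    for attempt in range(1, retries + 1):
        if probe():
            return attempt
        if attempt < retries:
            # Give the service time to come up before trying again.
            sleep(delay)
    raise TimeoutError(f"service not ready after {retries} attempts")
```

A caller would pass a probe that attempts the connection and returns `False` on rejection, so early rejections no longer fail the run.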
Add a hook to set up the OSP 17.1 overcloud ironic.
* ironic-python-agent glance images
* ironic provisioning network
* enroll ironic nodes

Jira: OSPRH-12423
Bumps the go_modules group with 1 update in the /roles/copy_container/files/copy-quay directory: [golang.org/x/crypto](https://github.com/golang/crypto).


Updates `golang.org/x/crypto` from 0.21.0 to 0.31.0
- [Commits](golang/crypto@v0.21.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
  dependency-group: go_modules
...

Signed-off-by: dependabot[bot] <[email protected]>
With recent changes in tobiko 0.8.4, the tobiko pods will download an
already customized image instead of using official ubuntu images that
were customized during tobiko execution.
Create a secret from the file and add it as a volume to the pod
Exposed a variable to pass the API endpoint without computing it too.
In NFV, some compute nodes belonging to the same deployment differ from
each other, so we need 2 nodesets to be able to deploy them.
In order to allow users to run the shiftstack role without being part of
the whole reproducer, the following changes are needed:

- Add the missing kubeconfig param.
- Load .bashrc while running ansible-navigator. That's needed
for reading the envvars that will be used for writing the junit report.
- Enable the option to exclude artifacts from being gathered.
Horizon is not enabled by default in CRC-based deployments.
This PR adds hooks to enable the horizon service.
openstack-k8s-operators#2607
missed a bracket, which results in an error.

Related-Issue: OSPRH-12508
For use cases like ShiftOnStack the deployment may need to tweak the
hypervisor.
We did not have a way to tell the deployment how to reach the hypervisor
so this commit exposes the hypervisor Ansible instance to each host
and creates a hypervisors group in the generated inventory.
Until this commit we were not pinning the dependencies, which resulted in
broken jobs as soon as a dependency changed underneath without notice.
This commit pins all the dependencies to known versions.
Antelope transitioned to unmaintained; update the
build_test_packages role to reflect that.
Execute all KUTTL tests without bailing out at the first error.
Instead make sure that the tests are executed and their JUnitXML
reports are collected if available.
Move the accept/fail condition at the end, following the same
pattern used elsewhere when running tests.
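The "run everything, decide at the end" pattern described in the KUTTL commit above can be sketched like this. It is a simplified illustration in Python rather than the role's Ansible tasks, and `run_all` and `final_verdict` are hypothetical names.

```python
def run_all(tests):
    """Run every test and return the names of those that failed.

    `tests` maps a test name to a zero-argument callable that returns
    True on success. A raised exception counts as a failure and does not
    stop the remaining tests from running.
    """
    failures = []
    for name, run in tests.items():
        try:
            passed = bool(run())
        except Exception:
            passed = False
        if not passed:
            failures.append(name)
    return failures


def final_verdict(failures):
    """Accept/fail condition, evaluated only after all tests have run."""
    if failures:
        raise AssertionError(f"failed tests: {', '.join(failures)}")
```

Report collection would sit between the two calls, so JUnitXML files are gathered regardless of the verdict.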
This PR exposes the resources parameter for all test-operator
related CRs (Tempest, Tobiko, AnsibleTest, HorizonTest). This parameter
can be used to specify the amount of resources that the test pods spawned
by the test-operator should consume [1].

This commit intentionally unsets the value for Tempest test pods as
the propagation of the fix for Tempest memory leak bug [1] did not
reach the upstream openstack-tempest-all image yet. Once the fix is
in place the default value of cifmw_test_operator_tempest_resources
can be changed to {}.

[1] openstack-k8s-operators/test-operator#253
To resolve the hostnames of shift-on-stack apps endpoints (routes)
from RHOSO pods, a DNS record entry must be added
to the dnsmasq service running on the hypervisor.

It needs to be done after shift-on-stack cluster is installed as
the installer is creating the floating IP for the apps endpoints,
and the FIP is not known in advance.

For that purpose the cifmw_shiftstack_hypervisor param is passed so
the playbook running on the shiftstackclient pod is able to reach
the hypervisor.

This commit also adds the
cifmw_shiftstack_shiftstackclient_incluster_kubeconfig_dir param to
the shiftstack-qa playbook execution so it's able to find the RHOSO
kubeconfig location.
For inspection we need to add the ironic-python-agent to the
/var/lib/ironic/httpboot folder. Update the hook to install
the package on groups - osp-controllers and osp-underclouds, and
copy the file to the appropriate location.
Add a number of variables to control various aspects of the control
plane testing. Using those new variables we can configure control
plane testing settings from the job definition.

Increased the default time we wait for the last vm to be created and
destroyed as 5 minutes (the previous timeout) was slightly too short.
It's now 7 minutes.

Closes: https://issues.redhat.com/browse/OSPRH-12349
If the RPM name points to a URL we now use a custom plugin to fetch the
content. If the endpoint challenges the plugins with SPNEGO
authentication and a kerberos ticket is present the plugin will
authenticate itself using the ticket.
This patch adds support in the test-operator role for running
HorizonTest tests in debug mode (as is already supported
for Tempest).
pablintino and others added 7 commits January 14, 2025 11:17
If the url to fetch using uri_request is not secured (it does not return
401/403) make requests_kerberos optional.
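A minimal sketch of what "optional" can mean here: import `requests_kerberos` lazily and require it only when the endpoint actually answers 401/403. The helper names `needs_auth` and `select_auth` are hypothetical and this is not the plugin's real code. The only library call assumed is `requests_kerberos.HTTPKerberosAuth`.

```python
try:
    # Third-party dependency; may be absent on hosts that only fetch
    # unsecured URLs.
    from requests_kerberos import HTTPKerberosAuth
except ImportError:
    HTTPKerberosAuth = None


def needs_auth(status_code: int) -> bool:
    """Treat 401/403 as a sign that the endpoint is secured."""
    return status_code in (401, 403)


def select_auth(status_code: int):
    """Return an auth handler only when the endpoint demands one (hypothetical helper)."""
    if not needs_auth(status_code):
        # Unsecured URL: requests_kerberos is not needed at all.
        return None
    if HTTPKerberosAuth is None:
        raise RuntimeError("requests_kerberos is required for secured URLs")
    return HTTPKerberosAuth()
```

The missing dependency only becomes an error on the code path that really needs Kerberos.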
Some roles like repo_setup may have altered the state of
/etc/yum.repos.d/ without removing the EPEL rpm, leading yum/dnf to
think the repos are already installed.
Enforcing its deletion ensures the rpm always deploys the repository
files.
Since the role seems to be used from zuul without a nested playbook this
role needs to have all of its dependencies self contained and not
depending on global modules. To avoid copy/pasting the content of the
module a symlink is used.
This change is required for zuul to call this role from its executor.
After GA the number of PRs that this repo receives is not that high, and
more importantly, almost all are legitimate ones that require review, so
we usually end up removing the draft status immediately or asking the user
to do so.

openshift-ci bot commented Jan 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


openshift-ci bot commented Jan 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cescgina for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lpiwowar

Ups

@lpiwowar lpiwowar closed this Jan 17, 2025